Active learning for misspecified generalized linear models

Author

Francis R. Bach

Abstract

Active learning refers to algorithmic frameworks aimed at selecting training data points in order to reduce the number of required training data points and/or improve the generalization performance of a learning method. In this paper, we present an asymptotic analysis of active learning for generalized linear models. Our analysis holds under the common practical situation of model misspecification, and is based on realistic assumptions regarding the nature of the sampling distributions, which are usually neither independent nor identical. We derive unbiased estimators of generalization performance, as well as estimators of expected reduction in generalization error after adding a new training data point, that allow us to optimize its sampling distribution through a convex optimization problem. Our analysis naturally leads to an algorithm for sequential active learning which is applicable for all tasks supported by generalized linear models (e.g., binary classification, multi-class classification, regression) and can be applied in non-linear settings through the use of Mercer kernels.

Similar resources

Active Learning for Misspecified Models

Active learning is the problem in supervised learning to design the locations of training input points so that the generalization error is minimized. Existing active learning methods often assume that the model used for learning is correctly specified, i.e., the learning target function can be expressed by the model at hand. In many practical situations, however, this assumption may not be fulf...

Empirical best linear unbiased prediction in misspecified and improved panel data models with an application to gasoline demand

Misspecifications in econometric models can result in misestimated coefficients. An improved method for specifying econometric models is presented. The mean square error of an empirical best linear unbiased predictor of an individual drawing for the dependent variable of an improved model is derived. These ideas are illustrated using certain misspecified and improved models of the demand for gas...

Misspecified Linear Bandits

We consider the problem of online learning in misspecified linear stochastic multi-armed bandit problems. Regret guarantees for state-of-the-art linear bandit algorithms such as Optimism in the Face of Uncertainty Linear bandit (OFUL) hold under the assumption that the arms' expected rewards are perfectly linear in their features. It is, however, of interest to investigate the impact of potentia...

Dust source mapping using satellite imagery and machine learning models

Predicting dust source areas and determining the factors affecting them is necessary in order to prioritize management practices that deal with desertification due to wind erosion in arid areas. Therefore, this study aimed to evaluate the application of three machine learning models (generalized linear model, artificial neural network, and random forest) to predict the vulnerability of dust cent...

Robust prediction and extrapolation designs for misspecified generalized linear regression models

We study minimax robust designs for response prediction and extrapolation in biased linear regression models. We extend previous work of others by considering a nonlinear fitted regression response, by taking a rather general extrapolation space and, most significantly, by dropping all restrictions on the structure of the regressors. Several examples are discussed. © 2007 Elsevier B.V. All righ...

Publication date: 2006
